On the special role of class-selective neurons in early training
It is commonly observed that deep networks trained for classification exhibit
class-selective neurons in their early and intermediate layers. Intriguingly,
recent studies have shown that these class-selective neurons can be ablated
without deteriorating network function. But if class-selective neurons are not
necessary, why do they exist? We attempt to answer this question in a series of
experiments on ResNet-50s trained on ImageNet. We first show that
class-selective neurons emerge during the first few epochs of training, before
receding rapidly but not completely; this suggests that class-selective neurons
found in trained networks are in fact vestigial remains of early training. With
single-neuron ablation experiments, we then show that class-selective neurons
are important for network function in this early phase of training. We also
observe that the network is close to a linear regime in this early phase; we
thus speculate that class-selective neurons appear early in training as
quasi-linear shortcut solutions to the classification task. Finally, in causal
experiments where we regularize against class selectivity at different points
in training, we show that the presence of class-selective neurons early in
training is critical to the successful training of the network; in contrast,
class-selective neurons can be suppressed later in training with little effect
on final accuracy. It remains to be understood by which mechanism the presence
of class-selective neurons in the early phase of training contributes to the
successful training of networks.
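
To make the quantities in this abstract concrete, here is a minimal sketch of how class selectivity and single-neuron ablation could be computed on recorded activations. The abstract does not state which selectivity metric the authors use. The index below, the preferred-class mean against the mean over the remaining classes, is one commonly used definition and is an assumption here. The function names `class_selectivity_index` and `ablate` are hypothetical.

```python
from collections import defaultdict


def class_selectivity_index(activations, labels, eps=1e-7):
    """Return one selectivity index per neuron (assumed metric, see lead-in).

    activations: list of per-sample lists of non-negative activations
                 (e.g. post-ReLU values averaged over spatial positions).
    labels:      list of class ids, one per sample.
    """
    # Group the sample rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(activations, labels):
        by_class[label].append(row)
    if len(by_class) < 2:
        raise ValueError("need samples from at least two classes")

    num_neurons = len(activations[0])
    indices = []
    for n in range(num_neurons):
        # Class-conditional mean activation of neuron n.
        means = [sum(r[n] for r in rows) / len(rows) for rows in by_class.values()]
        mu_max = max(means)
        # Mean over the remaining classes, excluding the preferred class.
        mu_rest = (sum(means) - mu_max) / (len(means) - 1)
        # Close to 1: fires for one class only. 0: identical mean for all classes.
        indices.append((mu_max - mu_rest) / (mu_max + mu_rest + eps))
    return indices


def ablate(activations, neuron):
    """Return a copy of the activations with one neuron's output set to zero."""
    return [[0.0 if i == neuron else v for i, v in enumerate(row)]
            for row in activations]
```

In a setup like the one described, the index would be tracked per neuron across training epochs, and the accuracy drop after ablating a single neuron would be compared against its index. That evaluation loop depends on the model and data pipeline and is not shown.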
Knowledge Graph Completion Models are Few-shot Learners: An Empirical Study of Relation Labeling in E-commerce with LLMs
Knowledge Graphs (KGs) play a crucial role in enhancing e-commerce system
performance by providing structured information about entities and their
relationships, such as complementary or substitutable relations between
products or product types, which can be utilized in recommender systems.
However, relation labeling in KGs remains a challenging task due to the dynamic
nature of e-commerce domains and the associated cost of human labor. Recently,
breakthroughs in Large Language Models (LLMs) have shown surprising results in
numerous natural language processing tasks. In this paper, we conduct an
empirical study of LLMs for relation labeling in e-commerce KGs, investigating
their learning capabilities in natural language and their effectiveness in
predicting relations between product types with limited labeled data. We
evaluate various LLMs, including PaLM and GPT-3.5, on benchmark datasets,
demonstrating their ability to achieve competitive performance compared to
humans on relation labeling tasks using just 1 to 5 labeled examples per
relation. Additionally, we experiment with different prompt engineering
techniques to examine their impact on model performance. Our results show that
LLMs significantly outperform existing KG completion models in relation
labeling for e-commerce KGs and exhibit performance strong enough to replace
human labeling.
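
As an illustration of the few-shot setting described in this abstract (1 to 5 labeled examples per relation), the sketch below assembles a prompt for labeling the relation between two product types. The prompt wording, the function name `build_relation_prompt`, and the example product types are assumptions made for illustration. The abstract does not give the authors' actual prompts, and no model API is called here.

```python
def build_relation_prompt(examples, query, relations):
    """Assemble a few-shot prompt for relation labeling between product types.

    examples:  list of (product_a, product_b, relation) labeled triples.
    query:     (product_a, product_b) pair whose relation should be predicted.
    relations: list of allowed relation labels.
    """
    lines = [
        "Label the relation between the two product types.",
        "Allowed labels: " + ", ".join(relations),
        "",
    ]
    # One block per labeled demonstration.
    for head, tail, label in examples:
        lines += [f"Product A: {head}", f"Product B: {tail}",
                  f"Relation: {label}", ""]
    # The query block is left open so the model completes the label.
    head, tail = query
    lines += [f"Product A: {head}", f"Product B: {tail}", "Relation:"]
    return "\n".join(lines)


# Hypothetical usage with a single labeled example.
prompt = build_relation_prompt(
    examples=[("tent", "sleeping bag", "complementary")],
    query=("tent", "camping stove"),
    relations=["complementary", "substitutable"],
)
print(prompt)
```

The returned string would be sent to whichever LLM is under evaluation, and the completion compared against the allowed labels.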